MULTIPLE CHANNEL ADAPTIVE FILTERING USING
A FAST ORTHOGONALIZATION NETWORK:

AN APPLICATION TO EFFICIENT PULSED
DOPPLER RADAR PROCESSING

I.  INTRODUCTION 

The direct adaptive filtering of multiple input channels by Gram-Schmidt orthogonalization has
been the subject of intense research during the past decade [1-14]. The Gram-Schmidt technique
(sometimes called the Adaptive Lattice Filter) has been shown to outperform other adaptive
algorithms simultaneously in arithmetic efficiency, stability, and convergence time.

Arithmetic efficiency has been demonstrated especially in the filtering of stationary covariance
sequences [4-14]. The stability of the algorithm is enhanced because it does not require the
calculation of an inverse covariance matrix, as does the Sample Matrix Inversion (SMI) algorithm
of Reed, Mallett, and Brennan [15]. An overview of Adaptive Lattice Filters and a large
bibliography on this subject are contained in Ref. 5.

In adaptive filtering, it is desirable to find the optimal weighting of multiple input channels
such that the output signal-to-noise power ratio (S/N) is a maximum. The desired signal is
associated with a desired signal column vector, $s$, where $s = (s_1, s_2, \ldots, s_N)^T$, $N$ is
the number of input channels, and $T$ denotes the vector transpose. The vector component $s_n$,
$n = 1, 2, \ldots, N$, represents the desired signal's component in the nth input channel. If $w$
is an $N$-length column vector denoting the optimal weighting of the $N$ input channels and $x$ is
an $N$-length column vector denoting the data from the $N$ input channels, then it can be shown
[16] that $w$ must satisfy the following vector equation:

$$R_{xx} w = \mu s^* \tag{1.1}$$

where

$$R_{xx} = E\{x^* x^T\}. \tag{1.2}$$

Here $\mu$ is an arbitrary constant which for convenience we set equal to one, $E\{\cdot\}$ denotes
the expected value, and $*$ denotes the complex conjugate. Equation (1.1) is often referred to as
the Applebaum Adaptive Algorithm [16]. The matrix, $R_{xx}$, is called the input covariance matrix.

For some filtering applications, there may be as many output channels as there are input channels
(such as a doppler processor). Hence, there will be $N$ desired signal vectors. We define $S$ to be
the $N \times N$ steering matrix of desired signal vectors; i.e.,

$$S = (s_1\ s_2\ \cdots\ s_N) \tag{1.3}$$

where $s_n$, $n = 1, 2, \ldots, N$, are column vectors of the desired signals. If $W$ is defined as
the optimal $N \times N$ weighting matrix, i.e., the weights that optimize the S/N in each of the
output channels, then these weights satisfy the following matrix equation:

$$R_{xx} W = S^*. \tag{1.4}$$

Problems occur in the solution for the weights if $R_{xx}$ is ill conditioned. Due to computational
inaccuracies, the algorithm can become unstable and the output channels extremely noisy. Adaptive
lattice filtering does not normally exhibit stability problems.

In fact, the weights formulated by Eq. (1.4) are not calculated at all when using adaptive lattice
filtering. The data in the input channels are filtered directly through an orthogonalization
network, as is demonstrated in the following sections. However, each output channel will have the
same or better (if $R_{xx}$ is ill conditioned) S/N performance as if the weights were calculated
exactly from Eq. (1.4) and applied to the input data set.

Most of the research on Adaptive Lattice Filters has concentrated on the processing of stationary
covariance sequences, especially in the area of discrete-time linear prediction systems [4-14].
This implies that if $r_{kj}$ is the $kj$ element of $R_{xx}$, then

$$r_{kj} = r_{k-j}. \tag{1.5}$$

Hence, the covariance matrix, $R_{xx}$, has the Toeplitz form.

In this report, we consider the efficient processing of channels that are not necessarily
stationary with respect to one another, so that Eq. (1.5) is not necessarily true. However, the
input data on a given channel will be assumed stationary with respect to other data in that
channel. The algorithm developed in the following sections will be a multichannel adaptive lattice
filter which is structured for arithmetic efficiency in addition to retaining the good stability
and fast convergence properties of orthogonalization networks. In Section VIII, we apply this
algorithm (called a fast orthogonalization network) to implement an arithmetically efficient
adaptive pulse doppler radar processor.

II.  DECORRELATORS 

Consider two channels of complex valued data: $X_1$ and $X_2$. We desire to form an output channel,
$Y$, which is decorrelated with $X_2$, i.e.,

$$\overline{Y X_2^*} = 0 \tag{2.1}$$

where the overbar denotes the expected value. This can be accomplished as follows. Let us write

$$Y = X_1 - w X_2 \tag{2.2}$$

and find a constant weight, $w$, such that Eq. (2.1) is satisfied. It can be shown that

$$w = \frac{\overline{X_1 X_2^*}}{\overline{|X_2|^2}} \tag{2.3}$$

where $|\cdot|$ denotes the complex magnitude function and $*$ denotes the complex conjugate.
Figure 1 represents this decorrelation processor (DP).

In a digital implementation of the decorrelator, samples of the two input channels would be taken
and the weight would be estimated. Let $\{X_i(1), X_i(2), \ldots, X_i(N_s)\}$, $i = 1, 2$, denote
the input data sequences for $X_1$ and $X_2$, where $N_s$ is the total number of discrete samples
taken per channel. Then the decorrelation weight could be estimated as

$$w = \frac{\sum_{n=1}^{N_s} X_1(n) X_2^*(n)}{\sum_{n=1}^{N_s} |X_2(n)|^2}. \tag{2.4}$$

Note that Eq. (2.4) does not account for changes in the noise environment. If the noise environment
is nonstationary, then such techniques as a "sliding window" or "forgetting factor" could be used
on the input data. This is discussed further in Section V.
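
The block estimate of Eq. (2.4) can be sketched in a few lines of Python. This is a minimal
illustration rather than the report's implementation: the function names `decorrelation_weight`
and `decorrelate` and the sample values are assumptions made for the example, and each channel is
simply a list of complex numbers.

```python
def decorrelation_weight(x1, x2):
    """Sample estimate of w = sum X1(n) X2*(n) / sum |X2(n)|^2, as in Eq. (2.4)."""
    num = sum(a * b.conjugate() for a, b in zip(x1, x2))
    den = sum(abs(b) ** 2 for b in x2)
    return num / den


def decorrelate(x1, x2):
    """Two-input decorrelation processor: Y(n) = X1(n) - w X2(n)."""
    w = decorrelation_weight(x1, x2)
    return [a - w * b for a, b in zip(x1, x2)]


# Hypothetical data: X1 contains a scaled copy of X2 plus a small disturbance.
x2 = [1 + 1j, 2 - 1j, -1 + 0.5j, 0.5 - 2j]
noise = [0.1, -0.2j, 0.05 + 0.05j, -0.1]
x1 = [0.7 * b + n for b, n in zip(x2, noise)]

y = decorrelate(x1, x2)
# Sample counterpart of Eq. (2.1): sum of Y(n) X2*(n) over the block.
residual = sum(a * b.conjugate() for a, b in zip(y, x2))
```

By construction the sample cross-correlation `residual` vanishes to within rounding error, which
is the sampled counterpart of Eq. (2.1).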

Fig. 1 - Decorrelation processor (DP)

Let us consider $N$ channels of complex valued data: $X_1, X_2, \ldots, X_N$. To form an output
channel, $Y$, which is decorrelated with $X_2, X_3, \ldots, X_N$, we write

$$Y = X_1 - w_2 X_2 - \cdots - w_N X_N. \tag{2.5}$$

We desire to find the weights, $w_n$, $n = 2, 3, \ldots, N$, such that

$$\overline{Y X_n^*} = 0, \qquad n = 2, 3, \ldots, N.$$

We define a weight vector, $w = (w_1, w_2, \ldots, w_N)^T$, where $w_1 = 1$. It can be shown that
$w$ is the solution of the following vector equation:

$$R_{XX} w = \mu\, (1, 0, \ldots, 0)^T \tag{2.6}$$

where $R_{XX}$ is the $N \times N$ covariance matrix of the input channels, i.e.,

$$R_{XX} = E\{X^* X^T\} \tag{2.7}$$

and $X = (X_1, X_2, \ldots, X_N)^T$. The constant $\mu$ is not arbitrary but chosen so that
$w_1 = 1$.

From Eq. (2.6), it is seen that the decorrelator could be implemented by taking data samples,
forming a sample covariance matrix as implied by Eq. (2.7), solving Eq. (2.6) for the weights, and
applying these weights to the input channels.

Another implementation of this decorrelation process is called Gram-Schmidt (GS) decomposition
[1-5], as illustrated in Fig. 2, which uses the basic two-input DP as a building block [3]. GS
decomposition decorrelates the inputs one at a time from the other inputs by using the basic
two-input DP shown in Fig. 1. For example, as seen in Fig. 2, in the first stage or level of
decomposition, $X_N$ is decorrelated with $X_1, X_2, \ldots, X_{N-1}$. Next, the output channel
which results from decorrelating $X_N$ with $X_{N-1}$ is

Fig. 3 - Multiple channel adaptive ... (input channels $X_1, X_2, \ldots, X_N$ enter the
orthogonalization network, which produces the output channels)


where $R_{XX}$ is defined by Eq. (2.6) and $I$ is the $N \times N$ identity matrix. Actually

$$R_{XX} = S^{-1} R_{xx} S^{\dagger -1} \tag{3.2}$$

where $\dagger$ denotes conjugate transpose. The steering matrix, $S^*$, has been transformed into
a steering matrix which is the identity matrix.


If we examine the new desired signal vectors (which are the column vectors of the identity
matrix), the nth channel has a desired signal vector

$$(0\ 0\ \cdots\ 0\ 1\ 0\ 0\ \cdots\ 0)^T$$

with the 1 in the nth position. Hence, it is seen that the form of Eq. (3.1) for the nth channel
is very similar to Eq. (2.6) except that the "1" in the steering vector is not necessarily in the
first position as seen in Eq. (2.6). However, to perform the decorrelation process, it is only
necessary to rearrange the input channels so that all other channels are decorrelated with the
nth input channel, $X_n$, as seen in Fig. 4. By using this decorrelation procedure, the
$N$-channel orthogonalization network seen in Fig. 3 is now defined.

The ordering of the input channels for decorrelation as seen in Fig. 4 was arbitrary. It is shown
in the next section that the input channels can be ordered so as to greatly reduce the required
number of arithmetic operations. If there were no logic behind choosing the ordering of the input
channels, it could be shown that the number of weights that are calculated by using this
decorrelation procedure is $0.5N^2(N - 1)$. In the following section we develop an algorithm
which requires approximately $1.5N(N - 1)$ weights for the same decorrelation process.

It  is  important  to  point  out  that  the  desired  signals  must  be  small  or  with  a  low  duty  cycle  with 
respect  to  the  noise  in  the  respective  channels.  Otherwise,  desired  signals  whose  vector  components 
are  slightly  different  from  the  steering  vector  components  will  be  cancelled  due  to  the  decorrelation 
process.  Radar  returns  sampled  in  range  are  a  good  example  of  a  low  duty  cycle  desired  signal  where 
targets  are  sparsely  distributed  across  the  range  bins. 


Fig. 4 - An arbitrary multichannel orthogonalization network (one output per channel, channel 1
through channel N)

IV.  FAST  ORTHOGONALIZATION  NETWORK 

In this section, we present a methodology for configuring the two-input decorrelation processors
(DPs) to synthesize the $N$-channel orthogonalization network seen in Fig. 3 so that numerical
efficiency is achieved. To this end, we introduce the following notation. A single channel
decorrelator will be represented as

$$Ch_1 = [X_1, X_2, X_3, \ldots, X_N] \tag{4.1}$$

where $X_2, X_3, \ldots, X_N$ are decorrelated from $X_1$, and $X_N$ is decorrelated first,
$X_{N-1}$ is decorrelated second, and so on. Figure 2 shows the structure of the decorrelator. The
channel variable, $Ch_1$, references this structure to channel 1 or the $X_1$ channel. The $X_n$,
$n = 1, 2, \ldots, N$, will be called the elements of the structure.
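
To make the notation concrete, the following sketch evaluates one structure
$[X_1, X_2, \ldots, X_N]$ by cascading two-input DPs, removing the rightmost element first. It
assumes block (sample-based) weights as in Eq. (2.4); the names `inner`, `dp`, and `decorrelator`
and the random test data are hypothetical and introduced only for this illustration.

```python
import random


def inner(a, b):
    """Sample correlation: sum over n of a(n) b*(n)."""
    return sum(x * y.conjugate() for x, y in zip(a, b))


def dp(left, right):
    """Two-input DP: remove from `left` the part correlated with `right`."""
    w = inner(left, right) / inner(right, right).real
    return [l - w * r for l, r in zip(left, right)]


def decorrelator(structure):
    """Evaluate [X_1, X_2, ..., X_N]; the rightmost element is decorrelated first."""
    chans = list(structure)
    while len(chans) > 1:
        ref = chans.pop()                    # current rightmost element
        chans = [dp(c, ref) for c in chans]  # one level of the GS structure
    return chans[0]


# Hypothetical data: 4 channels of 16 complex samples each.
rng = random.Random(0)
X = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(16)]
     for _ in range(4)]

ch1 = decorrelator(X)  # Ch_1 = [X_1, X_2, X_3, X_4]
leak = max(abs(inner(ch1, X[n])) for n in range(1, 4))
```

Here `leak`, the largest sample correlation between the output and $X_2, X_3, X_4$, is zero to
within rounding error. Evaluated in isolation, one such structure uses $N(N-1)/2$ DPs, which is
why sharing substructures between channels matters.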

Numerical efficiency of the algorithm to be presented is achieved by taking advantage of
redundancies that can occur for two different decorrelator structures. For example, let there be
eight channels. Channels 1 and 4 can be generated as follows:

$$Ch_1 = [X_1, X_2, X_3, X_4, X_5, X_6, X_7, X_8]$$
$$Ch_4 = [X_4, X_3, X_2, X_1, X_5, X_6, X_7, X_8]. \tag{4.2}$$

Note that $Ch_1$ and $Ch_4$ have the same four input channels at the far right. In the actual
implementation, the substructure associated with these four rightmost channels can be shared by
$Ch_1$ and $Ch_4$ as illustrated in Fig. 5. In fact, anytime two channels have exactly the same
far-right channels as indicated by the decorrelator structure, the substructure associated with
these far-right elements can be shared in the implementation process.

For convenience, let $N = 2^m$. In general we can configure $2^{m-1}$ output channels to have the
common substructure of $2^{m-1}$ input channels; $2^{m-2}$ output channels to have the common
substructure of $2^{m-2}$ input channels; and so on. Because of the structuring, the total number
of weights that must be calculated will be approximately proportional to $N^2$. A further
discussion of the number of arithmetic operations is given in the next section.

The following algorithm sequentially generates structures which can be implemented in a
numerically efficient manner:

STEP 1 Generate root structure: $[X_1, X_2, \ldots, X_{2^m}]$.

STEP 2 Generate a structure which is the inverted order of root structure:
$[X_{2^m}, \ldots, X_1]$.

STEP 1  $[X_1, X_2, X_3, X_4, X_5, X_6, X_7, X_8] = Ch_1$

STEP 2  $[X_8, X_7, X_6, X_5, X_4, X_3, X_2, X_1] = Ch_8$

STEP 3  $[X_4, X_3, X_2, X_1, X_5, X_6, X_7, X_8] = Ch_4$

        $[X_5, X_6, X_7, X_8, X_4, X_3, X_2, X_1] = Ch_5$

STEP 4  $[X_2, X_1, X_3, X_4, X_5, X_6, X_7, X_8] = Ch_2$

        $[X_7, X_8, X_6, X_5, X_4, X_3, X_2, X_1] = Ch_7$

        $[X_3, X_4, X_2, X_1, X_5, X_6, X_7, X_8] = Ch_3$

        $[X_6, X_5, X_7, X_8, X_4, X_3, X_2, X_1] = Ch_6$

Note that channels 1, 2, 3, 4 have the substructure associated with $X_5, X_6, X_7, X_8$, and that
channels 5, 6, 7, 8 have the substructure associated with $X_4, X_3, X_2, X_1$. Also note that
$Ch_1$ and $Ch_2$ have the same 6-element substructure, as do the channel pairs ($Ch_3$, $Ch_4$),
($Ch_5$, $Ch_6$), and ($Ch_7$, $Ch_8$). A complete realization of the 8 output channels is
illustrated in Fig. 6.


Fig. 6 - Complete realization of an eight-channel Fast Orthogonalization Network

V.  NUMBER  OF  ARITHMETIC  OPERATIONS 

Each two-input decorrelation processor (DP) of the Fast Orthogonalization Network (FON) as
depicted in Fig. 6 will have a complex weight associated with it. The number of DPs or complex
weights associated with a FON can be found by considering the number of DPs at each level of the
network. From Fig. 6, we see that the number of levels equals $N - 1$. If $L_k$ is equal to the
number of DPs at the kth level, then it can be shown that
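
$$
\begin{aligned}
L_1 &= 2^{m+1} - 2 \cdot 1 \\
L_2 &= 2^{m+1} - 2 \cdot 2 \\
L_3 &= 2^{m+1} - 2 \cdot 3 \\
&\ \ \vdots \\
L_{2^{m-1}} &= 2^{m+1} - 2 \cdot 2^{m-1} \\
L_{2^{m-1}+1} &= 2^{m+1} - 2^2 \cdot 1 \\
L_{2^{m-1}+2} &= 2^{m+1} - 2^2 \cdot 2 \\
&\ \ \vdots \\
L_{2^{m-1}+2^{m-2}} &= 2^{m+1} - 2^2 \cdot 2^{m-2} \\
&\ \ \vdots \\
L_{2^m-1} &= 2^{m+1} - 2^m \cdot 1.
\end{aligned}
\tag{5.1}
$$

Thus, the total number of DPs, $N_{DP}$, needed for a FON is derived by adding the right-hand
sides of the above system of equations. It can be shown that

$$N_{DP} = \tfrac{3}{2} N(N - 1) - \tfrac{1}{2} N \log_2 N. \tag{5.2}$$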
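
As a cross-check of the DP count, the sketch below counts the distinct two-input DPs in an
eight-channel network. The eight orderings are an assumption: they are written out from the
sharing pattern described in Section IV (channels 1 to 4 share the $X_5, \ldots, X_8$
substructure, and adjacent pairs share six elements). A DP is identified here by the element
being processed together with the ordered tail of elements already removed from it, so two
channels share a DP exactly when those coincide; this bookkeeping is an illustrative choice.

```python
from math import log2

# Assumed eight-channel structures (element indices, leftmost = output channel).
structures = {
    1: [1, 2, 3, 4, 5, 6, 7, 8],
    2: [2, 1, 3, 4, 5, 6, 7, 8],
    3: [3, 4, 2, 1, 5, 6, 7, 8],
    4: [4, 3, 2, 1, 5, 6, 7, 8],
    5: [5, 6, 7, 8, 4, 3, 2, 1],
    6: [6, 5, 7, 8, 4, 3, 2, 1],
    7: [7, 8, 6, 5, 4, 3, 2, 1],
    8: [8, 7, 6, 5, 4, 3, 2, 1],
}


def dp_nodes(structs):
    """Set of distinct DPs, keyed by (level, element, removed tail)."""
    nodes = set()
    for order in structs.values():
        n = len(order)
        for level in range(1, n):
            tail = tuple(order[n - level:])   # elements removed so far
            for element in order[:n - level]:
                nodes.add((level, element, tail))
    return nodes


nodes = dp_nodes(structures)
N = 8
per_level = [sum(1 for lv, _, _ in nodes if lv == k) for k in range(1, N)]
total = len(nodes)
formula = 1.5 * N * (N - 1) - 0.5 * N * log2(N)
# per_level -> [14, 12, 10, 8, 12, 8, 8]; total -> 72; formula -> 72.0
```

For $N = 8$ the per-level counts range between $N$ and $2N - 2$, and their sum agrees with
$1.5N(N-1) - 0.5N\log_2 N = 72$, compared with $0.5N^2(N-1) = 224$ DPs when nothing is shared.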


The above number is also equal to the total number of complex weights associated with a FON. The
total number of operations associated with a FON depends on whether the algorithm is implemented
(1) to recursively update the weights as new data samples arrive or (2) to calculate the weights
as a function of a block of $N_s$ data samples in each of the $N$ channels.

Case  I:  Recursive  Processing 

Let $w(k)$ be a scalar weight associated with one of the two-input DPs for the kth set of data
samples ($k = 1, 2, \ldots, N_s$). Also let $u_R(k)$, $u_L(k)$, and $u_{out}(k)$ be the complex
valued scalar right-side input, left-side input, and output of this particular DP, respectively,
on the kth sample as seen in Fig. 7. In this block diagram, input $u_R(k)$ is decorrelated with
input $u_L(k)$ and the decorrelated output is called $u_{out}(k)$. Also, we define two complex
valued scalar state variables associated with this DP: $v_1(k)$ and $v_2(k)$. We see from
Eq. (2.4) that $w(k)$ can be updated by using $u_R(k)$, $u_L(k)$, $v_1(k)$, and $v_2(k)$ as
follows:

$$v_1(k) = (1 - \alpha)\, v_1(k - 1) + \alpha\, u_L(k)\, u_R^*(k) \tag{5.3a}$$

$$v_2(k) = (1 - \alpha)\, v_2(k - 1) + \alpha\, |u_R(k)|^2 \tag{5.3b}$$

$$w(k) = \frac{v_1(k)}{v_2(k)} \tag{5.3c}$$

where $0 < \alpha < 1$ is a constant which controls how fast past data are forgotten. This
forgetting factor is necessary if the statistics of the input channels are time-varying.

The output of the specified DP has the form

$$u_{out}(k) = u_L(k) - w(k)\, u_R(k). \tag{5.4}$$

By inspecting Eqs. (5.3) and (5.4), we see that there are seven complex multiplication operations
(CMOPs) and one complex division operation (CDOP) per DP per data step or iteration. If
$N^{(R)}_{CMOP}$ and $N^{(R)}_{CDOP}$ are the numbers of CMOPs and CDOPs per iteration,
respectively, then by using Eq. (5.2), we can show

$$N^{(R)}_{CMOP} = 10.5\,N(N - 1) - 3.5\,N \log_2 N \tag{5.5}$$

$$N^{(R)}_{CDOP} = 1.5\,N(N - 1) - 0.5\,N \log_2 N. \tag{5.6}$$
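
The recursive update of a single DP can be sketched as a small stateful object. This is an
illustration under stated assumptions rather than the report's code: the class name
`RecursiveDP`, the zero initial state, and the guard against a zero denominator are choices made
here, and the update follows the form $v_1 \leftarrow (1-\alpha)v_1 + \alpha u_L u_R^*$,
$v_2 \leftarrow (1-\alpha)v_2 + \alpha |u_R|^2$, $w = v_1/v_2$, $u_{out} = u_L - w u_R$.

```python
class RecursiveDP:
    """One two-input decorrelation processor with a forgetting factor."""

    def __init__(self, alpha):
        if not 0.0 < alpha < 1.0:
            raise ValueError("alpha must lie strictly between 0 and 1")
        self.alpha = alpha
        self.v1 = 0j    # running estimate of the cross-correlation (assumed start)
        self.v2 = 0.0   # running estimate of the right-input power (assumed start)

    def step(self, u_left, u_right):
        """Process one sample pair; return (u_out, w)."""
        a = self.alpha
        self.v1 = (1 - a) * self.v1 + a * u_left * u_right.conjugate()
        self.v2 = (1 - a) * self.v2 + a * abs(u_right) ** 2
        w = self.v1 / self.v2 if self.v2 > 0.0 else 0j
        return u_left - w * u_right, w


# Hypothetical input: the left channel is exactly half of the right channel.
unit = RecursiveDP(alpha=0.1)
right = [1 + 2j, -0.5 + 1j, 2 - 1j, 0.3 + 0.3j]
history = [unit.step(0.5 * r, r) for r in right]
final_output, final_weight = history[-1]
```

With a left input that is an exact multiple of the right input, the weight settles on that
multiple from the first sample and the output is cancelled; with noisy data, smaller values of
$\alpha$ average over more past samples.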

Case  II:  Block  Processing 

For block processing, the total number of CMOPs, $N^{(B)}_{CMOP}$, is the sum of the number of
CMOPs associated with finding the $N_{DP}$ weights and the number of CMOPs associated with
processing the $N_s$ input data samples per channel through a FON.

It can be shown by using Eq. (5.2), inspection of Eq. (2.4), and the fact that each DP must
process $N_s$ data points that

$$N^{(B)}_{CMOP} = N_s \left( 4.5\,N(N - 1) - 1.5\,N \log_2 N \right). \tag{5.7}$$

Since only one CDOP per DP is used in the block processing technique, it follows that

$$N^{(B)}_{CDOP} = 1.5\,N(N - 1) - 0.5\,N \log_2 N. \tag{5.8}$$

Let us compare the arithmetic efficiency of the block-processed FON algorithm with the Sample
Matrix Inversion (SMI) algorithm [15]. For the SMI there are $N^2 N_s$ CMOPs needed to calculate
the sample covariance matrix, $\hat{R}_{xx}$, and approximately $N^3/3$ CMOPs required to find
$\hat{R}_{xx}^{-1}$. If the steering matrix is the identity matrix, then the weighting matrix
equals $\hat{R}_{xx}^{-1}$. Finally, there are $N^2 N_s$ CMOPs required to multiply the
$N \times N$ weighting matrix times the $N \times N_s$ input data matrix. Thus, if $N_{SMI}$ is
the number of CMOPs needed to implement the SMI algorithm, then

$$N_{SMI} = \tfrac{1}{3} N^3 + 2 N_s N^2. \tag{5.9}$$

It is shown in Ref. 15 that if the input channels are zero mean gaussian processes, then the
average of the output S/N will be within 3 dB of the optimum S/N after $2N$ data samples per
channel. Thus, if we set $N_s = 2N$, then for $N \gg 1$

$$N^{(B)}_{CMOP} \approx 9N^3 \tag{5.10}$$

$$N_{SMI} \approx 4.33N^3. \tag{5.11}$$

Hence we see that the FON algorithm is about half as fast as the SMI algorithm in attaining good
S/Ns (within 3 dB of the optimum). However, we note that the FON algorithm does not require the
inversion of a matrix, which can lead to numerical instabilities if the sample covariance matrix
is ill-conditioned.
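
The comparison can be tabulated with a short script. The two counting formulas below are taken
as read from Eqs. (5.7) and (5.9), namely $N_s(4.5N(N-1) - 1.5N\log_2 N)$ CMOPs for the
block-processed FON and $N^3/3 + 2N_sN^2$ CMOPs for the SMI; the function names and the chosen
values of $N$ are assumptions for illustration.

```python
from math import log2


def fon_block_cmops(n, ns):
    """CMOPs for the block-processed FON: three per DP per sample."""
    return ns * (4.5 * n * (n - 1) - 1.5 * n * log2(n))


def smi_cmops(n, ns):
    """CMOPs for SMI: covariance estimate, inversion, and weight application."""
    return n ** 3 / 3 + 2 * ns * n ** 2


# Use N_s = 2N samples per channel, as in the text.
ratios = {}
for n in (8, 64, 1024):
    ratios[n] = fon_block_cmops(n, 2 * n) / smi_cmops(n, 2 * n)
```

The ratio approaches $9/4.33 \approx 2.08$ from below as $N$ grows; for small $N$ the
lower-order terms make the FON comparatively cheaper than the asymptotic figure suggests.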

VI.  PARALLEL  PROCESSING 

From Fig. 6, it is seen that there are exactly $N - 1$ levels of DPs associated with a FON. From
Eq. (5.1), we observe that the maximum number of DPs per level is $2N - 2$ and the minimum number
is $N$. For the block processing algorithm described in the previous section, we see that as the
data (all $N_s L_{k-1}$ data points) are processed through the kth level, the input data may be
discarded and the output data ($N_s L_k$ data points) become the new input data set. Hence
$2N - 2$ parallel DPs could be configured as seen in Fig. 8.

Fig. 8 - Parallel processing architecture of the Fast Orthogonalization Network (FON)

We allow these $2N - 2$ DPs to simultaneously perform all of the two-input DPs at a given level
as shown in Fig. 6. However, for this to occur there must be a routing algorithm whose function
is to see that the $L_{k-1}$ input data channels are routed into the proper DPs. From Fig. 8, we
see that the output data are stored back into the input data memory bank. After sequencing
$N - 1$ times through the DPs, the algorithm is finished.

From inspecting Eq. (2.4), it is observed that the processing time through a DP is proportional
to the number of data points per input channel, $N_s$. Since the bank of $2N - 2$ DPs is used
$N - 1$ times, the total processing time, $T_{para}$, using a parallel architecture, is
approximately proportional to the number of input channels and the number of data points per
channel; that is,

$$T_{para} \propto N N_s.$$

Hence, parallel processing, by using the architecture seen in Fig. 8, can significantly decrease
the processing time. This reduction occurs because of the inherent structure of the FON. In
addition, the number of parallel DPs required is only $2N - 2$, which reduces the hardware
requirements.

VII.  SOFTWARE  ALGORITHM 

A software algorithm called the Fast Orthogonalization Network (FON) algorithm has been devised
which generates the $2^m$ decorrelator output channels from the $2^m$ input channels by using the
common substructures of the various channels as described in Section IV (note that $N = 2^m$).
Let each of the $2^m$ input channels have $N_s$ sample points. Thus in the nth channel,
$X_n(1), X_n(2), \ldots, X_n(N_s)$ are observed.

The algorithm requires at various points the reduction of $2^k$ input channels to $2^{k-1}$
output channels through a partial orthogonalization as illustrated in Fig. 9. The $2^{k-1}$
rightmost input channels are decorrelated with the $2^{k-1}$ leftmost input channels. If
$U(I, J)$ are the input samples, where $J = 1, 2, \ldots, 2^k$ indicates the channel number and
$I = 1, 2, \ldots, N_s$ is the sample index, the following software algorithm, called the kth
Order Partial Orthogonalization algorithm, generates the desired $2^{k-1}$ output channels:

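1. Set $V^{(0)}(I, J) = U(I, J)$; $J = 1, 2, \ldots, 2^k$; $I = 1, 2, \ldots, N_s$

2. Set $\lambda = 1$, $J_0 = 2^k$

3. Calculate recursively

   $$V^{(\lambda)}(I, J) = V^{(\lambda-1)}(I, J) - V^{(\lambda-1)}(I, J_0)\, w^{(\lambda)}(J)$$

   where

   $$w^{(\lambda)}(J) = \frac{\sum_{I=1}^{N_s} V^{(\lambda-1)}(I, J)\, V^{(\lambda-1)*}(I, J_0)}{\sum_{I=1}^{N_s} |V^{(\lambda-1)}(I, J_0)|^2}$$

   $J = 1, 2, \ldots, 2^k - \lambda$; $I = 1, 2, \ldots, N_s$

4. Set $\lambda = \lambda + 1$, $J_0 = J_0 - 1$

5. If $\lambda \le 2^{k-1}$ GO TO 3

6. end

The final outputs are contained in $V^{(2^{k-1})}(I, J)$.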
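
The partial orthogonalization can be sketched in Python as follows. This is a hedged
illustration: it stores the data channel-first (a list of channels, each a list of samples)
instead of the $(I, J)$ array layout, uses block weights of the form of Eq. (2.4), and the names
`inner`, `remove`, and `partial_orthogonalization` are hypothetical.

```python
import random


def inner(a, b):
    """Sample correlation: sum over samples of a b*."""
    return sum(x * y.conjugate() for x, y in zip(a, b))


def remove(channel, ref):
    """One DP: subtract from `channel` its component correlated with `ref`."""
    w = inner(channel, ref) / inner(ref, ref).real
    return [c - w * r for c, r in zip(channel, ref)]


def partial_orthogonalization(U):
    """Reduce 2^k channels to 2^(k-1): decorrelate the rightmost half, one
    channel at a time starting from the far right, from the remaining channels."""
    V = [list(ch) for ch in U]
    for _ in range(len(U) // 2):
        ref = V.pop()                       # channel J0 of the current pass
        V = [remove(ch, ref) for ch in V]
    return V


# Hypothetical data: 4 channels (k = 2), 12 complex samples each.
rng = random.Random(1)
U = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(12)]
     for _ in range(4)]
V = partial_orthogonalization(U)
leak = max(abs(inner(v, U[j])) for v in V for j in (2, 3))
```

The two output channels are uncorrelated, in the sample sense, with the two rightmost input
channels; the pass uses $\sum_{\lambda=1}^{2^{k-1}} (2^k - \lambda)$ DPs, i.e. 5 for $k = 2$.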

The FON algorithm requires that the input channels, $X_1, X_2, \ldots, X_N$, be commutated so
that the final output channels of the FON are properly aligned. (This problem also occurs with
the FFT algorithm, but a commutation algorithm matches the proper output channel with the input
channel.) By properly aligned, we mean that the kth output channel of the FON algorithm is
decorrelated with input channels $1, 2, \ldots, k - 1, k + 1, \ldots, N$. The following
algorithm, called the Commutated Indexing algorithm, computes the commutated indices and stores
them in an $N$-element array called INDEX.


Fig. 9 - Partial orthogonalization, $2^k$ input channels to $2^{k-1}$ output channels

1. Set INDEX(1) = 1, INDEX(2) = 2

2. Set $J = 1$

3. Set INDEX($2^J + k$) = INDEX($k$) + $3 \cdot 2^{J-1}$; $k = 1, \ldots, 2^{J-1}$

4. Set INDEX($2^J + k$) = INDEX($k$) + $2^{J-1}$; $k = 2^{J-1} + 1, \ldots, 2^J$

5. $J = J + 1$

6. If $J < m$ GO TO 3

7. end
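
A direct transcription of the indexing recursion is sketched below, assuming the reading
INDEX($2^J + k$) = INDEX($k$) + $3 \cdot 2^{J-1}$ for the first half of each new block and
INDEX($k$) + $2^{J-1}$ for the second half. The function name `commutated_index` is
hypothetical, and the loop test is placed at the top so that $m = 1$ is handled, which differs
slightly from the GO TO form.

```python
def commutated_index(m):
    """Return the 1-based commutated indices for N = 2**m channels (m >= 1)."""
    index = [0, 1, 2]            # index[0] is unused so that positions are 1-based
    j = 1
    while j < m:
        half = 2 ** (j - 1)
        index.extend([0] * (2 ** j))
        for k in range(1, half + 1):
            index[2 ** j + k] = index[k] + 3 * half
        for k in range(half + 1, 2 ** j + 1):
            index[2 ** j + k] = index[k] + half
        j += 1
    return index[1:]


index8 = commutated_index(3)
# index8 -> [1, 2, 4, 3, 7, 8, 6, 5]
```

Under this reading the sequence coincides with the reflected binary (Gray) code, INDEX($i$) =
$((i-1) \oplus \lfloor (i-1)/2 \rfloor) + 1$, which offers a convenient independent check.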

After  defining  these  preliminary  algorithms,  we  now  give  the  complete  FON  algorithm: 

1. Input $X(I, J)$ array; $J = 1, 2, \ldots, 2^m$; $I = 1, 2, \ldots, N_s$

2. Calculate the commutated indices by using the Commutated Indexing algorithm.

3. Transfer $X(I, \mathrm{INDEX}(J)) \to Y(I, J)$;
   $J = 1, 2, \ldots, 2^m$; $I = 1, 2, \ldots, N_s$

4. Set $k = m$

5. Set $L = 0$

6. Transfer $Y(I, J + 2^k L) \to U(I, J)$;
   $J = 1, 2, \ldots, 2^k$; $I = 1, 2, \ldots, N_s$

7. Calculate $V^{(2^{k-1})}(I, J)$; $J = 1, 2, \ldots, 2^{k-1}$; $I = 1, 2, \ldots, N_s$ by using
   the kth Order Partial Orthogonalization algorithm.

8. Transfer $V^{(2^{k-1})}(I, J) \to T(I, J)$;
   $J = 1, 2, \ldots, 2^{k-1}$; $I = 1, 2, \ldots, N_s$

9. Transfer $Y(I, 2^k L + J) \to U(I, 2^k + 1 - J)$;
   $J = 1, 2, \ldots, 2^k$; $I = 1, 2, \ldots, N_s$

10. Calculate $V^{(2^{k-1})}(I, J)$; $J = 1, 2, \ldots, 2^{k-1}$; $I = 1, 2, \ldots, N_s$ by using
    the kth Order Partial Orthogonalization algorithm.

11. Transfer $T(I, J) \to Y(I, 2^k L + J)$;
    $J = 1, 2, \ldots, 2^{k-1}$; $I = 1, 2, \ldots, N_s$

12. Transfer $V^{(2^{k-1})}(I, J) \to Y(I, 2^k L + 2^{k-1} + J)$;
    $J = 1, 2, \ldots, 2^{k-1}$; $I = 1, 2, \ldots, N_s$

13. $L = L + 1$

14. If $L < 2^{m-k}$ GO TO 6

15. $k = k - 1$

16. If $k > 0$ GO TO 5

17. end
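
Putting the pieces together, the complete block-processed algorithm can be sketched as below.
This is an illustration under stated assumptions, not the report's software: the data are stored
channel-first, structured loops replace the GO TO steps, each block of $2^k$ channels is
partially orthogonalized once in its given order and once in reversed order, and the helper
names (`inner`, `remove`, `partial_orth`, `commutated_index`, `fon`) are hypothetical.

```python
import random


def inner(a, b):
    return sum(x * y.conjugate() for x, y in zip(a, b))


def remove(channel, ref):
    w = inner(channel, ref) / inner(ref, ref).real   # block weight, Eq. (2.4) form
    return [c - w * r for c, r in zip(channel, ref)]


def partial_orth(U):
    """Decorrelate the rightmost half of U (far right first) from the left half."""
    V = [list(ch) for ch in U]
    for _ in range(len(U) // 2):
        ref = V.pop()
        V = [remove(ch, ref) for ch in V]
    return V


def commutated_index(m):
    idx = [1, 2]
    for j in range(1, m):
        half = 2 ** (j - 1)
        idx += ([idx[k] + 3 * half for k in range(half)]
                + [idx[k] + half for k in range(half, 2 * half)])
    return idx


def fon(X):
    """Return N channels; output k is decorrelated with every input except k."""
    n = len(X)
    m = n.bit_length() - 1
    if m < 1 or n != 2 ** m:
        raise ValueError("number of channels must be a power of two, at least 2")
    Y = [X[i - 1] for i in commutated_index(m)]        # commutate the inputs
    for k in range(m, 0, -1):                          # k = m, m-1, ..., 1
        size = 2 ** k
        for start in range(0, n, size):                # blocks L = 0, 1, ...
            block = Y[start:start + size]
            T = partial_orth(block)                    # left half survives
            V = partial_orth(block[::-1])              # reversed: right half survives
            Y[start:start + size] = T + V
    return Y


# Hypothetical data: 8 channels, 40 complex samples each.
rng = random.Random(2)
X = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
     for _ in range(8)]
out = fon(X)
leak = max(abs(inner(out[k], X[j]))
           for k in range(8) for j in range(8) if j != k)
```

With block weights the sample correlation between output $k$ and every other input channel is
zero to within rounding error, which is the alignment property the commutation is meant to
guarantee. For eight channels the passes use $44 + 20 + 8 = 72$ DPs in total.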

VIII.  AN  APPLICATION:  ADAPTIVE  DOPPLER  PROCESSING 

Adaptive filtering can be applied to radar doppler filter design [17,18]. Doppler filters are
designed to accept doppler frequency shifted moving targets while rejecting the returns from the
target background (clutter). The clutter is usually slow moving so that its energy is normally
concentrated about the zero doppler frequency. A bank of filters is used to cover the entire
doppler band; i.e., the doppler band is equally divided into subbands. Ideally, it would be
desirable to place a rectangular bandpass filter about each subband so that the large clutter
return is completely rejected out of band. However, only approximations of this rectangular
filter are realizable. It has been shown [17,18] that adaptive doppler processing yields superior
signal-to-clutter power ratio improvement over these approximate rectangular filter
implementations. This results because each doppler filter is designed not only to accept a
desired signal but also to place nulls at frequencies out of band where clutter returns exist.
Each doppler filter is optimized with respect to the doppler filter's allocated subband by use of
the Applebaum algorithm.

The input channels, $x_n$, $n = 1, 2, \ldots, N$, are formed by taking time-delayed samples
(usually one pulse repetition interval (PRI)). Hence if $r(t)$ is the received radar signal, then

$$x_n = r(t - nT), \qquad n = 1, 2, \ldots, N. \tag{8.1}$$

There are $N$ weights associated with each of the $N$ doppler filters. These $N^2$ weights, $W$,
are the solution of the following matrix equation:

$$R_{xx} W = S^* \tag{8.2}$$

where $R_{xx}$ is the covariance matrix of the $N$ input channels. If the input data are
statistically stationary in time, then $R_{xx}$ is a Toeplitz matrix. The matrix $S$ is the matrix
of steering vectors signifying the various doppler filter subbands. In general, the steering
matrix has the following form:


$$
S = \begin{pmatrix}
a_1 & a_1 & a_1 & \cdots & a_1 \\
a_2 & a_2 r_N & a_2 r_N^{2} & \cdots & a_2 r_N^{N-1} \\
a_3 & a_3 r_N^{2} & a_3 r_N^{4} & \cdots & a_3 r_N^{2(N-1)} \\
\vdots & \vdots & \vdots & & \vdots \\
a_N & a_N r_N^{N-1} & a_N r_N^{2(N-1)} & \cdots & a_N r_N^{(N-1)(N-1)}
\end{pmatrix}
\tag{8.3}
$$

where $r_N = \exp(-j 2\pi/N)$ and $j = \sqrt{-1}$. For the Brennan and Reed doppler processing
algorithm [17], $a_n = 1$, $n = 1, 2, \ldots, N$. For this algorithm, each of the $N$ doppler
filters is optimized at one particular doppler frequency by use of the Applebaum algorithm. The
particular doppler frequency is chosen to be at the center of the given subband. However, because
in general the doppler shift is unknown within a given subband, any desired signal whose doppler
is not at the center of one of these subbands will not be properly matched, and signal detection
will degrade, especially at dopplers that are close in to the clutter spectrum.

Andrews [18] devised a steering matrix that gives superior performance. For this algorithm the
doppler shift is assumed unknown across a particular subband, and the filter response is
optimized over the entire subband. In essence, the best $N$-point finite impulse response (FIR)
filter is fitted to a desired rectangular filter centered in a particular subband with a phase
bandwidth of $2\pi/N$. Andrews shows that these weights are given by

$$a_n = \frac{\sin\left[\frac{\pi}{N}\left(n - \frac{N+1}{2}\right)\right]}{\frac{\pi}{N}\left(n - \frac{N+1}{2}\right)}, \qquad n = 1, 2, \ldots, N. \tag{8.4}$$

The form of $S^*$ as given by Eq. (8.3) is such that it can be factored as

$$S^* = AB \tag{8.5}$$

where $A$ is a diagonal matrix with diagonal elements $a_n^*$, $n = 1, 2, \ldots, N$, and

$$B = \left( r_N^{-(n-1)(l-1)} \right); \qquad n, l = 1, 2, \ldots, N. \tag{8.6}$$

If we employ the multiple channel adaptive lattice filter by using a FON as depicted in Fig. 3,
we see that the $N$ input data channels, $x_1, x_2, \ldots, x_N$, are transformed by
$S^{*-1} = B^{-1} A^{-1}$. Due to the special form of the matrix, $B$, it can be shown that

$$B^{-1} = B^* \tag{8.7}$$

to within a constant scale factor of $1/N$, which does not affect the output S/N. Therefore, the
$N$ output channels, $X = (X_1, X_2, \ldots, X_N)^T$, that result by multiplying the $N$ input
channels by $S^{*-1}$ are given by

$$X = B^* A^{-1} x. \tag{8.8}$$

Equation (8.8) indicates that the input data channels, $x$, are first weighted by the inverse of
the diagonal matrix $A$, which is equivalent to weighting the nth channel by $1/a_n^*$,
$n = 1, 2, \ldots, N$. The weighted channels are then multiplied by the matrix, $B^*$. It can be
shown that the transformation that results by multiplying a set of $N$ channels by $B^*$ can be
implemented by using a fast Fourier transform (FFT) if $N = 2^m$. The covariance matrix,
$R_{xx}$, is Toeplitz. However, note that the output channels of the FFT are not stationary with
respect to each other; i.e., $R_{XX}$ is not a Toeplitz matrix. The $N$ outputs of the FFT are
then processed by using the Fast Orthogonalization Network discussed in the previous sections.
Figure 10 is a simplified diagram of this processor.
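
The preprocessing step that precedes the FON can be sketched as follows for a single snapshot
across the $N$ channels. It assumes that $A$ is diagonal with elements $a_n^*$ and that $B^*$ has
elements $\exp(-j2\pi(n-1)(l-1)/N)$, i.e. the kernel of a forward discrete Fourier transform; a
direct $O(N^2)$ sum is used here for self-containment where an FFT would be used in practice.
The function name `preprocess` and the test tone are hypothetical.

```python
import cmath
import math


def preprocess(x, a):
    """Return X = B* A^{-1} x for one snapshot x of N channel samples."""
    n = len(x)
    # A^{-1}: weight channel l by 1 / a_l*
    weighted = [xl / complex(al).conjugate() for xl, al in zip(x, a)]
    # B*: forward DFT kernel exp(-j 2 pi r l / N), written with 0-based r and l
    return [sum(weighted[l] * cmath.exp(-2j * math.pi * r * l / n)
                for l in range(n))
            for r in range(n)]


# Brennan-Reed case a_n = 1, with a hypothetical tone centered on doppler bin 3.
N = 8
tone = [cmath.exp(2j * math.pi * 3 * l / N) for l in range(N)]
X = preprocess(tone, [1.0] * N)
peak_bin = max(range(N), key=lambda r: abs(X[r]))
```

With unit weights the tone falls entirely into one doppler bin; the FON then operates on the $N$
transformed channels, which are in general no longer stationary with respect to one another.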


Fig. 10 - Adaptive doppler processor using a lattice filter (outputs are the doppler bins)

Let the input data set have the form depicted in Fig. 11. Here the input data are arranged so
that the returns from sequential range cells are sequential samples of a given channel and the
data in each channel at a given range are the time-delayed (one PRI) return of the preceding
channel. In this figure, there are $N_s$ range bins, with $R_0$ the minimum range considered and
$\Delta R$ the range resolution. Returns from a given range bin occur one per PRI time step. If
$N$ PRIs are in the processing time window, then the returns in the nth PRI form the nth input
channel; i.e., $r_{mn}$ is the radar return from the mth range cell in the nth PRI. If this input
data set is block processed by using the adaptive doppler processor as seen in Fig. 10, then the
output data set will have the form depicted in Fig. 11. This matrix of output data will have
elements, $r'_{mn}$, corresponding to the returns in a given range-doppler bin. If blocks of
input data are sequentially processed, then the resultant output data sets can be fed into a
postdetection processor for the detection and tracking of targets.